┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│                              Documentation for                              │
│                                                                             │
│                                 TPU2ASM.EXE                                 │
│                                                                             │
│                         A symbolic disassembler for                         │
│                       Turbo Pascal version 5.0 units                        │
│                                                                             │
│                    Copyright (C) 1989 by Per Bent Larsen                    │
│                             All rights reserved                             │
│                                                                             │
│                           Version 1.0, March 1989                           │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
What it is
──────────────────────────────────────────────────────────────────────────────
TPU2ASM is a symbolic disassembler that extracts compiled code from a Turbo
Pascal version 5.0 unit (TPU) file. The output is a TASM-compatible assembly
file.
What it can be used for
──────────────────────────────────────────────────────────────────────────────
The utility can be used as an aid in developing EXTERNAL procedures and
functions for Turbo Pascal. The programmer writes a template function or
procedure in Pascal and lets the compiler set up the parameter and local
variable allocation automatically.

You can also write the entire routine in Pascal and then optimize the code
produced by the compiler by hand. The code generated by Turbo Pascal leaves
much room for improvement, as you will soon find out.

Another use for the utility is studying the code generated by the compiler to
determine which Pascal constructs are preferable to others in terms of
execution speed and code size. You'll be surprised.

Finally, this utility makes it theoretically possible to convert an entire
program to assembler, allowing you to link it with other languages such as C
or PROLOG. To do this you need the Turbo Pascal Runtime Library Source,
available from Borland Int'l, because the RTL must itself be converted into
pure assembler. Be warned - this may not be a trivial process.
What it isn't
──────────────────────────────────────────────────────────────────────────────
This utility was not meant to be a reverse engineering tool, though it can be
used as such to a limited extent. All references are shown symbolically, and
when a symbol is not known due to lack of information in the symbol tables, the
program does not attempt to make up a symbol of its own. When TPU2ASM cannot
find the symbol for a reference, it is always because one or more units were
compiled with $D- or $L-. References that cannot be resolved symbolically are
left as ???? in the output file. The exceptions are items that are always
unnamed, such as the CS constant block. These items are named by TPU2ASM.
How to use it
──────────────────────────────────────────────────────────────────────────────
The program is invoked from the DOS prompt as follows:

    TPU2ASM UnitName OutputFile [Procedure|Function]

UnitName is the name of the unit to extract from. Do not specify an extension.
The unit is assumed to be in a file with the same name and a TPU extension, or
in TURBO.TPL.

OutputFile specifies where the assembly output should go. The extension
defaults to ASM. If the file exists, you are prompted for permission to
overwrite it.

The last parameter is an optional procedure or function name. If you omit it,
TPU2ASM assumes that you wish to extract all procedures and functions from the
unit and prompts you for verification. Only global procedures and functions can
be extracted, because local routines always have an access link to the local
data in the owner routine (more on that later). Global routines pull all their
local routines into the disassembly as well.
Limitations
──────────────────────────────────────────────────────────────────────────────
TPU2ASM reads the symbol tables of all referenced units into RAM. This imposes
a restriction on the number and size of units referenced. However, I haven't
encountered any heap overflows on my 640K machine, even when extracting from
very large units with many references. Only symbol tables are kept in RAM, not
entire units.

Extracting EXTERNAL routines cannot always be done correctly. A Turbo Pascal
procedure or function is always stored in the TPU as a single block. If the
entry point is zero, the entire block is known to be executable code; otherwise
the first part of the block is known to be CS constants (literal data such as
strings, sets, range checking values and so on) and the rest is executable
code. Not so with externals. OBJ files can have multiple routines in them, and
they can have data areas anywhere in both the data and code segments. Turbo
Pascal does not keep track of anything but the data size, the code size
(including CS data) and the entry points for the PUBLIC routines.

When extracting EXTERNALs from a TPU, TPU2ASM initially assumes that the entire
block up to the entry point is data. During disassembly, it checks for jumps
and calls into the assumed data part. If such a branch is encountered, the
disassembly process is restarted with this new address as the starting point.
This process is repeated as required. The consequence of this strategy is that
if the external module has multiple PUBLIC symbols, you may get the same
routine in the output more than once. Also, any data after the starting point
is assumed to be code.
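
The restart strategy described above can be sketched in a few lines of Python.
This is only a toy model under stated assumptions, not TPU2ASM's actual code:
instructions are looked up in a table instead of being decoded from 8086
opcodes, and the names find_code_start, TOY_BLOCK and BLOCK_SIZE are
hypothetical.

```python
# Toy model, not TPU2ASM's real decoder: each offset maps to
# (mnemonic, size_in_bytes, branch_target_or_None).
TOY_BLOCK = {
    4: ("inc ax", 1, None),   # helper routine below the entry point
    5: ("ret", 1, None),
    6: ("call 4", 3, 4),      # PUBLIC entry point; calls the helper
    9: ("ret", 1, None),
}
BLOCK_SIZE = 10               # offsets 0..3 hold data and are never decoded


def find_code_start(instructions, entry_point, block_size):
    """Return the lowest offset that has to be treated as code."""
    start = entry_point       # initial assumption: everything below is data
    while True:
        offset = start
        restart = None
        while offset < block_size:
            _mnemonic, size, target = instructions[offset]
            if target is not None and target < start:
                restart = target      # branch into the assumed data part
                break
            offset += size
        if restart is None:
            return start              # no branch lands below the start
        start = restart               # restart the sweep from the new address


print(find_code_start(TOY_BLOCK, 6, BLOCK_SIZE))
```

With an entry point of 6, the call to offset 4 moves the starting point back to
4, so only offsets 0 to 3 are still treated as data. Everything from the
starting point to the end of the block is swept as code, which is why data
placed after the starting point ends up disassembled as instructions.
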
TPU2ASM attempts to create an ASM file that can be assembled directly by TASM.
However, TPU2ASM does not attempt to resolve symbol conflicts between Turbo
Pascal and TASM. Two types of errors are likely to occur in an ASM file
produced by TPU2ASM: either you have used an identifier name in your Pascal
program that happens to be a reserved word in TASM, or you have local and
global identifiers with duplicate names. TASM does not handle scope the way
Pascal does. Both types of errors can be dealt with by choosing other
identifier names in the Pascal source.
Symbols defined by TPU2ASM
──────────────────────────────────────────────────────────────────────────────
When disassembling a code or data reference, TPU2ASM uses the symbol from the
Pascal source if it is available. Data and code external to the unit are
referenced by the Pascal symbol directly. Parameters and local variables have a
BP EQUate immediately following the PROC header, and all subsequent references
are made through the EQU symbol. All local variables and labels are preceded by
@@ to avoid conflicts with other PROCs.

When disassembling functions, TPU2ASM defines a BP EQUate named $RESULT, which
denotes the function result buffer. When disassembling local routines, an
additional BP symbol $LINK is defined, which denotes the base pointer (BP) link
to the calling routine's local data.

Local label names are defined as @@<number>: , except when calls or jumps
outside the PROC-ENDP pair are encountered (this can only occur in EXTERNALs).
In this case the label names are defined as <proc name><number>$. If the unit
has initialization code, it is given the name <UnitName>$INIT.

TASM offers a TPASCAL model directive to simplify the creation of EXTERNALs for
Turbo Pascal. TPU2ASM does not use it, however, in order to give you complete
flexibility to change the code. When you use TPASCAL, TASM always sets up a
so-called base frame, which is not always necessary.
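
The label naming rules in this section can be summarized as a small Python
sketch. The function names and the sample identifiers MyProc and MyUnit are
hypothetical, and the text does not say how <number> is chosen, so it is simply
passed in here.

```python
def local_label(number):
    """Label local to a PROC; it appears as '@@<number>:' where it is defined."""
    return f"@@{number}"


def external_label(proc_name, number):
    """Label for a call or jump target outside the PROC-ENDP pair (EXTERNALs only)."""
    return f"{proc_name}{number}$"


def init_name(unit_name):
    """Name given to a unit's initialization code."""
    return f"{unit_name}$INIT"


print(local_label(1), external_label("MyProc", 2), init_name("MyUnit"))
```
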
References to the SYSTEM unit
───────────────────────────────────────────────────